A Pipeline for Identifying Integration Sites of Mobile Elements in the Genome Using Next-Generation Sequencing
Abstract
Next-generation sequencing (NGS) reads that span the junction between a mobile element and the flanking host sequence, obtained from individuals in a population, are typically mapped to a reference genome to locate the mobile element-host junction. We propose a clustering pipeline that groups such NGS reads into clusters corresponding to the locations of integration sites in the genome. The pipeline relies on the UCLUST clustering software, which groups reads according to a clustering threshold, to cluster integration-site reads by their site of origin. The optimal clustering threshold is chosen using a proposed clustering measure, the I-index. We evaluate the pipeline on simulated integration-site data from the human genome and compare its performance with that of UCLUST clustering. The pipeline recovers both the number and the correct sequence of the integration sites more accurately than UCLUST clustering. It can help detect mobile element-host junctions in a population for species that have no reference genome.
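The threshold-selection idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `greedy_cluster` is a simple greedy centroid clustering and is not UCLUST, `identity` uses `difflib.SequenceMatcher` as a rough stand-in for alignment-based identity, and `placeholder_score` is a hypothetical validity score, since the abstract does not define the I-index. The sketch only shows the general shape of the procedure: cluster the reads at several thresholds, score each clustering, and keep the best-scoring threshold.

```python
from difflib import SequenceMatcher
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Rough similarity in [0, 1]; a stand-in for alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(reads, threshold):
    """Assign each read to the first centroid it matches at >= threshold;
    a read that matches no centroid founds a new cluster."""
    clusters = []  # list of (centroid, members) pairs
    for read in reads:
        for centroid, members in clusters:
            if identity(read, centroid) >= threshold:
                members.append(read)
                break
        else:
            clusters.append((read, [read]))
    return clusters


def placeholder_score(clusters):
    """Hypothetical validity score, NOT the I-index from the abstract:
    smallest centroid-to-centroid distance minus the mean
    read-to-centroid distance. Assumes at least one read."""
    within = [1 - identity(r, c) for c, members in clusters for r in members]
    between = [1 - identity(c1, c2)
               for (c1, _), (c2, _) in combinations(clusters, 2)]
    separation = min(between) if between else 0.0
    return separation - sum(within) / len(within)


def pick_threshold(reads, thresholds, score=placeholder_score):
    """Return the candidate threshold whose clustering scores highest."""
    return max(thresholds, key=lambda t: score(greedy_cluster(reads, t)))
```

Calling `pick_threshold(reads, [0.80, 0.85, 0.90, 0.95])` on a list of read strings returns one of the candidate thresholds. To reproduce the pipeline itself, the clustering step would be run with UCLUST and the score would be the authors' I-index, neither of which this sketch implements.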
Publication date: 2016